Non-redundant Subgroup Discovery Using a Closure System
نویسندگان
چکیده
Subgroup discovery is a local pattern discovery task, in which descriptions of subpopulations of a database are evaluated against some quality function. As standard quality functions are functions of the described subpopulation, we propose to search for equivalence classes of descriptions with respect to their extension in the database rather than individual descriptions. These equivalence classes have unique maximal representatives forming a closure system. We show that minimum cardinality representatives of each equivalence class can be found during the enumeration process of that closure system without additional cost, while finding a minimum representative of a single equivalence class is NP-hard. With several real-world datasets we demonstrate that search space and output are significantly reduced by considering equivalence classes instead of individual descriptions and that the minimum representatives constitute a family of subgroup descriptions that is of same or better expressive power than those generated by traditional methods.
منابع مشابه
Discovering statistically non-redundant subgroups
The objective of subgroup discovery is to find groups of individuals who are statistically different from others in a large data set. Most existing measures of the quality of subgroups are intuitive and do not precisely capture statistical differences of a group with the other, and their discovered results contain many redundant subgroups. Odds ratio is a statistically sound measure to quantify...
متن کاملNon-redundant Subgroup Discovery in Large and Complex Data
Large and complex data is challenging for most existing discovery algorithms, for several reasons. First of all, such data leads to enormous hypothesis spaces, making exhaustive search infeasible. Second, many variants of essentially the same pattern exist, due to (numeric) attributes of high cardinality, correlated attributes, and so on. This causes top-k mining algorithms to return highly red...
متن کاملGroups in which every subgroup has finite index in its Frattini closure
In 1970, Menegazzo [Gruppi nei quali ogni sottogruppo e intersezione di sottogruppi massimali, Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. 48 (1970), 559--562.] gave a complete description of the structure of soluble $IM$-groups, i.e., groups in which every subgroup can be obtained as intersection of maximal subgroups. A group $G$ is said to have the $FM$...
متن کاملComparative analysis of profit between three dissimilar repairable redundant systems using supporting external device for operation
The importance in promoting, sustaining industries, manufacturing systems and economy through reliability measurement has become an area of interest. The profit of a system may be enhanced using highly reliable structural design of the system or subsystem of higher reliability. On improving the reliability and availability of a system, the production and associated profit will also increase. Re...
متن کاملMTBF evaluation for 2-out-of-3 redundant repairable systems with common cause and cascade failures considering fuzzy rates for failures and repair: a case study of a centrifugal water pumping system
In many cases, redundant systems are beset by both independent and dependent failures. Ignoring dependent variables in MTBF evaluation of redundant systems hastens the occurrence of failure, causing it to take place before the expected time, hence decreasing safety and creating irreversible damages. Common cause failure (CCF) and cascading failure are two varieties of dependent failures, both l...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009